Regularizing Heteroschedastic Discriminant Analysis for Speech Recognition

Author

  • Hakan Erdoğan
Abstract

Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT), applied to spliced static MFCC features, yields important performance gains over MFCC+dynamic features in most speech recognition tasks. It is reasonable to regularize the computation of the LDA transform for stability. In this paper, we regularize LDA and heteroschedastic LDA transforms using two methods: (1) statistical priors for estimating the transform, and (2) structural constraints on the transform. Our structural constraint imposes a block-structured LDA transform in which each block acts on the same cepstral parameters across frames. This second approach suggests using new coefficients for the static, first-difference, and second-difference operators in place of the standard ones. We test the new algorithms on two tasks, TIMIT and AURORA2. We obtain consistent improvement over standard MFCC features, and we also improve upon LDA+MLLT features for certain noise levels in the AURORA2 tests.
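
To make the block-structure idea concrete, the following is a minimal sketch in Python with NumPy. It is not taken from the paper. It only shows that the standard static and first-difference (delta) features are themselves a structured linear transform of spliced frames, in which each block applies one temporal filter to the same cepstral index across frames. The function names, the five-frame context, the 13-dimensional random "cepstra", and the use of the common regression-style delta weights are all illustrative assumptions. The abstract's structured approach would replace the fixed filter coefficients with estimated ones, and that estimation is not shown here.

```python
import numpy as np

def splice(frames, context):
    """Stack each frame with `context` neighbours on each side (edges repeated).

    Row t of the result is [c_{t-context}, ..., c_t, ..., c_{t+context}],
    laid out frame by frame.
    """
    T, _ = frames.shape
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    return np.hstack([padded[k:k + T] for k in range(2 * context + 1)])

def delta_weights(N=2):
    """Common regression weights n / (2 * sum_{m=1..N} m^2) for n = -N..N."""
    n = np.arange(-N, N + 1)
    return n / (2.0 * np.sum(np.arange(1, N + 1) ** 2))

def structured_transform(temporal_filters, d):
    """Build a transform whose blocks are scaled identity matrices.

    Each temporal filter w yields a (d x len(w)*d) slab in which block k is
    w[k] * I_d, so the same filter acts on every cepstral index separately.
    """
    return np.vstack([np.kron(w[None, :], np.eye(d)) for w in temporal_filters])

# Toy data standing in for 13-dimensional MFCC frames (hypothetical).
rng = np.random.default_rng(0)
cepstra = rng.standard_normal((50, 13))

X = splice(cepstra, context=2)                      # shape (50, 65)
static = np.array([0.0, 0.0, 1.0, 0.0, 0.0])        # picks the centre frame
A = structured_transform([static, delta_weights(2)], d=13)  # shape (26, 65)
features = X @ A.T                                  # static + delta, (50, 26)
```

In this layout an unconstrained LDA transform on the same spliced input would have a full 65-column row per output dimension, whereas the structured version is described by only five coefficients per temporal filter, which is one way to see why the constraint acts as a regularizer.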

Similar resources

Regularizing linear discriminant analysis for speech recognition

Feature extraction is an essential first step in speech recognition applications. In addition to static features extracted from each frame of speech data, it is beneficial to use dynamic features (called ∆ and ∆∆ coefficients) that use information from neighboring frames. Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT) applied to spliced...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

Subspace Kernel Discriminant Analysis for Speech Recognition

Kernel Discriminant Analysis (KDA) has been successfully applied to many pattern recognition problems. KDA transforms the original problem into a space of dimension N, where N is the number of training vectors. For speech recognition, N is usually prohibitively high, increasing computational requirements beyond current computational capabilities. In this paper, we provide a formulation of a subsp...

Publication date: 2005